Review for NeurIPS paper: Learning Differentiable Programs with Admissible Neural Heuristics
Summary and Contributions: This work considers the problem of synthesizing programs from input/output examples, where some components of the program may have continuous parameters and the entire program is differentiable with respect to those parameters. Neurosymbolic programs are a special case of this setup (symbolic programs that can call out to neural modules as needed). This is an especially challenging combinatorial search problem: not only must we consider an infinitely large, discrete space of program structures, but we must also perform an inner-loop optimization over continuous parameters. The approach taken here is an explicit symbolic graph search over the discrete space of partial programs. As the heuristic function for this graph search, neural networks are trained to approximate the behavior of the unfinished portions of the program syntax tree.
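To make the search concrete, here is a minimal, self-contained sketch of the idea in a toy setting. It is not the paper's code: the DSL, helper names, and costs are all hypothetical, and the neural relaxation is replaced by a free affine map fitted by least squares. Because every completion in this toy DSL is itself affine, the fitted relaxation's loss lower-bounds the loss of any completion, giving an admissible heuristic for A* over partial programs.

```python
import heapq

# Toy DSL of affine unary ops; any composition of them is affine in x,
# so a free affine map a*z + b over-approximates every possible suffix.
OPS = {
    "inc":    lambda x: x + 1.0,
    "double": lambda x: 2.0 * x,
    "neg":    lambda x: -x,
}

def run(prog, x):
    """Apply a sequence of op names to a scalar input."""
    for op in prog:
        x = OPS[op](x)
    return x

def mse(prog, data):
    """Mean squared error of a (complete) program on the examples."""
    return sum((run(prog, x) - y) ** 2 for x, y in data) / len(data)

def relaxation_loss(prefix, data):
    """Stand-in for the trained neural relaxation: replace the program's
    unfinished suffix with a free affine map a*z + b, fitted in closed
    form by least squares. Since all true completions are affine, this
    loss lower-bounds the loss of ANY completion (admissible)."""
    zs = [run(prefix, x) for x, _ in data]
    ys = [y for _, y in data]
    n = len(data)
    zbar, ybar = sum(zs) / n, sum(ys) / n
    var = sum((z - zbar) ** 2 for z in zs)
    a = 0.0 if var == 0 else sum(
        (z - zbar) * (y - ybar) for z, y in zip(zs, ys)) / var
    b = ybar - a * zbar
    return sum((a * z + b - y) ** 2 for z, y in zip(zs, ys)) / n

def near_astar(data, max_depth=4, eps=1e-9, struct_cost=0.01):
    """A* over partial programs (op-name prefixes).
    f(node) = structural cost so far + relaxation-loss heuristic."""
    frontier = [(relaxation_loss((), data), ())]
    while frontier:
        f, prefix = heapq.heappop(frontier)
        if mse(prefix, data) <= eps:      # prefix already fits the data
            return prefix
        if len(prefix) == max_depth:
            continue
        for op in OPS:
            child = prefix + (op,)
            g = struct_cost * len(child)  # cost of a deeper program
            h = relaxation_loss(child, data)
            heapq.heappush(frontier, (g + h, child))
    return None

# Target function y = 2*(x + 1); the search recovers ("inc", "double").
examples = [(0.0, 2.0), (1.0, 4.0), (2.0, 6.0)]
print(near_astar(examples))  # -> ('inc', 'double')
```

The structural cost per operation plays the role of the paper's path cost, so among the many programs that fit the data exactly, the search prefers the shortest one.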
Generating programs has been a long-standing problem in AI for many decades. Reviewers found it valuable that this approach combines the prior literature on heuristic search with modern neural networks to improve performance. They also noted that methods combining the discrete and continuous parts of programs are in short supply, making this work of wide interest and likely to spur further research. That the approach is conceptually straightforward yet not obvious, and that it performs well where more complex methods such as TerpreT are not suitable, was also highlighted as a significant advance. Reviewers wished to see more on the interpretability of the learned models.
Learning Differentiable Programs with Admissible Neural Heuristics
We study the problem of learning differentiable functions expressed as programs in a domain-specific language. Such programmatic models can offer benefits such as composability and interpretability; however, learning them requires optimizing over a combinatorial space of program "architectures". Our key innovation is to view various classes of neural networks as continuous relaxations over the space of programs, which can then be used to complete any partial program. All the parameters of this relaxed program can be trained end-to-end, and the resulting training loss is an approximately admissible heuristic that can guide the combinatorial search. We instantiate our approach on top of the A* and Iterative Deepening Depth-First Search algorithms and use these algorithms to learn programmatic classifiers in three sequence classification tasks.